Abstract. The yields from transit surveys can be used to constrain the
frequency and statistical properties of extrasolar planets. Conversely, planet
frequencies can be used to estimate expected detection rates, which are
critical for the planning and execution of these surveys. Here I review efforts to
accomplish these two related goals, both of which generally require realistic
simulations. Early attempts to predict planet yields generally resulted in overly
optimistic detection rates that have not been realized. I point out where these
estimates likely went wrong, and emphasize the strong biases and sensitivity
to detection thresholds inherent in transit surveys. I argue that meaningful
comparisons between observed and predicted detection rates require proper
calibration of these biases and thresholds. In the few cases where this has been done,
the observed rates agree with the results from radial velocity surveys for
similar stellar environments. I then go on to describe recent, detailed calculations
which should provide more accurate rates, which can be critically compared to
observed yields. Finally, I discuss expectations for future all-sky synoptic
surveys, which may have the sensitivity to detect hundreds or thousands of close-in
transiting planets. Realizing the enormous potential of these surveys will require
novel methods of coping with the overwhelming number of astrophysical false
positives that will accompany the planet detections.

1. Introduction



There are two basic reasons for conducting transit surveys for extrasolar planets.
The first, most obvious, and most familiar, is simply to find transiting planets in
order that one can perform the host of follow-up studies that are possible with
these systems. Such studies enable one to measure many otherwise unobservable
physical properties of the planets (see Charbonneau et al. 2006 and references
therein). As these studies are best suited to bright systems, uncovering the
transiting planets orbiting the brightest host stars is the primary motivation of
many wide-angle photometric surveys, as well as several radial velocity surveys.

The second reason to conduct transit surveys is to constrain the frequency of 
short-period planets as a function of mass, radius, and period. Since only ~ 10% 
of short-period planets transit their host stars, and the transit duty cycle is only 
~ 5%, one might wonder whether it is wiser to do this using a detection method 
which shows a persistent signal over a larger range of inclinations, such as pre- 
cision radial velocities. Of course this is being done, but currently photometric 
surveys can probe a larger number of systems over a larger range of distances 
from the Sun, and so can detect intrinsically rarer systems, or constrain the 
properties of planets in environments beyond the local solar neighborhood, such 
as open clusters, globular clusters, the distant Galactic disk, or even the Galactic
bulge. 
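
The transit probability and duty cycle quoted above can be checked with a short calculation. The following is only an illustrative sketch, not part of the original analysis: it assumes a circular orbit, a planet of negligible mass, and a Sun-like host, and it uses the standard geometric approximations that the transit probability is R*/a and the duty cycle of a central transit is about R*/(pi a). The function names are hypothetical.

```python
import math

G = 6.674e-11      # gravitational constant [m^3 kg^-1 s^-2]
M_SUN = 1.989e30   # solar mass [kg]
R_SUN = 6.957e8    # solar radius [m]
DAY = 86400.0      # seconds per day


def semimajor_axis(period_days, mstar_msun=1.0):
    """Kepler's third law for a planet of negligible mass; returns a in metres."""
    p = period_days * DAY
    return (G * mstar_msun * M_SUN * p ** 2 / (4.0 * math.pi ** 2)) ** (1.0 / 3.0)


def transit_probability(period_days, mstar_msun=1.0, rstar_rsun=1.0):
    """Geometric probability R*/a for a circular orbit seen at random inclination."""
    return rstar_rsun * R_SUN / semimajor_axis(period_days, mstar_msun)


def duty_cycle(period_days, mstar_msun=1.0, rstar_rsun=1.0):
    """Fraction of the orbit spent in a central transit, approximately R*/(pi a)."""
    return transit_probability(period_days, mstar_msun, rstar_rsun) / math.pi


if __name__ == "__main__":
    for period in (1.0, 3.0, 5.0):
        print(period, transit_probability(period), duty_cycle(period))
```

For a 3-day orbit around a Sun-like star this gives a transit probability of about 11% and a duty cycle of a few percent, the same order as the values quoted in the text.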

Proper planning and execution of transit surveys requires the ability to pre- 
dict the number of planets that will be detected, given a model for the frequency 
and distribution of planets, the properties of the survey, and the detection crite- 
ria. Conversely, these predictions are required in order to use observed detection 
rates from completed surveys to infer the intrinsic frequency and distribution 
of planets. Here I review efforts to accomplish these two related goals, both of 
which generally require realistic simulations of transit surveys. 

2. Predicting Planet Yields 

One can broadly divide the methods of predicting planet yields into two cate- 
gories. The first category, which I will call reverse (or a posteriori) modeling, 
uses the known properties of an observed sample of stars around which one is 
searching for planets to calibrate the survey efficiency. The second category, 
which I will call forward (or ab initio) modeling, uses assumed distributions of 
the stellar and planetary properties to statistically predict the ensemble prop- 
erties of the target stars and the detection efficiency of the survey as a whole. 
These two methods have different advantages and drawbacks, and their relative 
usefulness depends on the context in which they are applied. 

In the reverse approach, one attempts to model, as accurately as possible, 
the total detection probability of each individual star in the survey sample. The 
expected number of detections is then just the sum of the detection probabilities 
over all the stars in the survey. For example, given a known mass $M_k$ and
radius $R_k$ for each star $k$, one can determine the individual transit probabilities
$P_{{\rm tr},k}$. These stellar properties can be combined with the properties of the survey
observations (cadence, photometric uncertainties, correlations) to determine the
detection probability for each star, $P_{{\rm det},k}$. Then, assuming a distribution $df/dr\,dP$
of planets as a function of period $P$ and planet radius $r$, the (differential) number
of expected planet detections, $\langle N \rangle$, is given by,

$$\frac{d\langle N\rangle}{dr\,dP} = \frac{df(r,P)}{dr\,dP} \sum_k P_{{\rm tr},k}(M_k,R_k,P)\,P_{{\rm det},k}(M_k,R_k,r,P). \tag{1}$$

The advantage of the reverse approach is that it is more accurate and
accounts for Poisson fluctuations in the individual stellar properties. The
disadvantage is that it requires knowledge of the properties of the individual stars. This
is relatively straightforward to obtain for surveys toward stellar systems such as
globular or open clusters, but can be quite difficult to obtain for field surveys,
for which the physical properties of the individual stars are poorly constrained
due to their unknown distances and foreground extinction. Furthermore, the
reverse approach is not generally applicable for predicting the expected yields
of future surveys.

In the forward approach, one dispenses with any hope of modeling the
detection probabilities of the stars individually, but rather attempts to construct
statistical distributions of the stellar properties, and use these to predict the
ensemble detection probability of the stars in the survey. The average number of
planets that a transit survey should detect is the product of the differential stellar
density distribution along the line-of-sight ($dn/dM\,dR\,dL$), the distribution of
planets ($df/dr\,dP$), the transit probability, and the detection probability:

$$\frac{d\langle N\rangle}{dM\,dR\,dL\,dr\,dP\,d\ell\,d\Omega} = \frac{dn}{dM\,dR\,dL}\,\frac{df(r,P)}{dr\,dP}\,P_{\rm tr}(M,R,P)\,P_{\rm det}(M,R,L,r,P)\,\ell^2. \tag{2}$$

Here $L$ is the stellar luminosity, $\ell$ is the distance along the line-of-sight, and thus
$\ell^2\,d\ell\,d\Omega$ represents the volume element.

Of the ingredients that enter into equation (2), the detection probability
is one of the most critical and must be specified with care. In particular, the
detection probability must account for all selection cuts that are employed in
the actual surveys, such as cuts on parameters output from transit-finding
algorithms^1, the number of transits, the source color and magnitude, and the radial
velocity precision (as used for confirmation). When comparing the predictions 
of surveys to results from completed experiments, the cuts must be applied con- 
sistently in the data and the model, otherwise any inferences are highly suspect. 
This can be particularly difficult when trying to model any 'by-eye' cuts. 
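
To illustrate how the selection cuts enter, the sketch below evaluates the per-star sum of equation (1) for a small, made-up catalog. It is a toy under stated assumptions, not any survey's actual pipeline: the detection probability is a step function (1 if the white-noise S/N and the number of observable transits both pass their thresholds, 0 otherwise), the period distribution is uniform over the supplied grid, and all names, thresholds, and catalog values are hypothetical.

```python
import math

AU_IN_RSUN = 215.03     # 1 AU in solar radii
RJUP_IN_RSUN = 0.1028   # equatorial radius of Jupiter in solar radii


def transit_probability(r_star, m_star, period_days):
    """Geometric transit probability R*/a (solar units, circular orbit)."""
    a_au = (m_star * (period_days / 365.25) ** 2) ** (1.0 / 3.0)
    return min(1.0, r_star / (a_au * AU_IN_RSUN))


def detection_probability(star, r_planet_rjup, period_days, n_obs,
                          sn_min=10.0, min_transits=2, span_days=30.0):
    """Toy step-function P_det: 1 if every selection cut passes, else 0."""
    p_tr = transit_probability(star["R"], star["M"], period_days)
    depth = (RJUP_IN_RSUN * r_planet_rjup / star["R"]) ** 2
    n_in_transit = n_obs * p_tr / math.pi          # N_obs times the duty cycle
    sn = math.sqrt(n_in_transit) * depth / star["sigma"]
    n_transits = span_days / period_days
    return 1.0 if (sn >= sn_min and n_transits >= min_transits) else 0.0


def expected_detections(stars, frequency, periods, r_planet_rjup, n_obs, **cuts):
    """Sum P_tr,k * P_det,k over the catalog, averaged over the period grid."""
    total = 0.0
    for star in stars:
        for p in periods:
            total += (transit_probability(star["R"], star["M"], p)
                      * detection_probability(star, r_planet_rjup, p, n_obs, **cuts))
    return frequency * total / len(periods)


if __name__ == "__main__":
    catalog = [
        {"M": 1.0, "R": 1.0, "sigma": 0.005},   # Sun-like dwarf (hypothetical)
        {"M": 1.2, "R": 5.0, "sigma": 0.002},   # evolved star (hypothetical)
    ]
    print(expected_detections(catalog, 0.01, [1.0, 2.0, 3.0, 4.0, 5.0], 1.0, 2000))
```

In this toy catalog the evolved star contributes nothing despite its better photometry, because the transit of a Jupiter-sized planet is too shallow to pass the S/N cut. Changing `sn_min` or `min_transits` changes the predicted yield, which is why the same cuts must be applied to the data and the model.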

The forward approach is well-suited to the planning of future surveys, and 
field surveys where the masses and radii of the individual stars are not known. 
The drawback of the forward approach is that it is less exact, requires consider- 
ably more input assumptions, and is sensitive to uncertainties in these assump- 
tions. One can improve the accuracy of the forward approach by adopting a 
hybrid method in which one imposes external observational constraints on the
model. For example, one of the most important indicators of the expected num- 
ber of detections is simply the total number of stars being surveyed. Therefore, 
observed number counts of stars as a function of magnitude and color can be 
used to constrain the parameters of the forward model and so make the predicted 
yields more reliable. 

2.1. Simple Estimates Fail 

The previous discussion notwithstanding, at first glance transit surveys appear 
fairly straightforward, and the requirements for detecting a transiting planet 
may seem reasonably clear. Given that the transit probability for short-period
planets is $P_{\rm tr} \sim 10\%$, given that transits are expected to have a depth of $\delta \sim 1\%$,
and given that the frequency of short period planets is known to be $f \sim 1\%$,
then one might expect the number of detected events to be

$$\langle N\rangle \sim f\,P_{\rm tr}\,N_{<1\%} \sim 10^{-3}\,N_{<1\%}, \quad \mbox{(Naive Estimate)}, \tag{3}$$

where $N_{<1\%}$ is the number of surveyed stars with photometric precision better
than $\sigma \sim 1\%$. Early estimates of the yield of transit surveys varied in complexity,
but generally centered around this extreme simplification of equation (2). This
estimate would imply that one would need to monitor only $\sim 10^3$ stars with
$\sigma \lesssim 1\%$ for a duration of $2P \sim 6$ days in order to detect a transiting planet. In
fact, results from successful surveys imply that the number of stars that must
be monitored is closer to $\sim 10^5$, depending on the survey.
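
The size of the mismatch is easy to quantify. The short sketch below simply evaluates equation (3) with the round numbers quoted in the text; the function names are hypothetical and the calculation is purely illustrative.

```python
def naive_yield(n_stars, frequency=0.01, transit_prob=0.10):
    """Naive estimate of equation (3): <N> ~ f * P_tr * N."""
    return frequency * transit_prob * n_stars


def stars_per_detection(frequency=0.01, transit_prob=0.10):
    """Number of monitored stars the naive estimate requires per detection."""
    return 1.0 / (frequency * transit_prob)


if __name__ == "__main__":
    needed_naive = stars_per_detection()   # about 1e3 stars
    needed_actual = 1.0e5                  # order of magnitude quoted in the text
    print(needed_naive, needed_actual / needed_naive)
```

With these inputs the naive estimate is optimistic by roughly two orders of magnitude.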



1 For example, the $\alpha$ and SDE parameters in the BLS algorithm (Kovacs et al. 2002).
Why does this simple estimate fail? There are many reasons, but the most
important include the following: (1) A large fraction (often the majority) of the
stars in the field are either giants or upper main-sequence stars that are too large
to enable the detection of transits due to Jupiter-sized planets (Gould & Morgan
2003; Brown 2003). (2) 1% photometry is not a sufficient requirement for
detecting a transit; one typically needs to exceed some sort of signal-to-noise ratio
(S/N) requirement. The S/N in turn depends on the depth of the transit, the
photometric accuracy, and the number of points taken during transit.
Furthermore, uncertainties in ground-based photometry can be correlated on the
typical time scales of transits, thereby reducing the statistical power of the data
(Pont et al. 2006). (3) One generally requires several transits for detection,
which when combined with the small duty cycle of the transits and losses in
single-site observations, can result in a substantial suppression of the number of
planets detected. (4) Magnitude-limited radial velocity surveys are more biased
toward metal-rich (and so planet-rich) stars than S/N-limited transit surveys.
Therefore the frequency of planets is likely to be substantially smaller than 1%
for the typical stars monitored in transit surveys (Gould et al. 2006).



2.2. Selection Effects are Critical 

One key point that must be addressed when simulating transit surveys is what 
it means to 'detect' a transiting planet. Typically transit surveys use a number
of criteria to select transit-like features from observed light curves, but most
trigger on some variant of a S/N criterion^2. For uncorrelated noise, the S/N is
approximately given by,

$$\frac{S}{N} \simeq N_{\rm tr}^{1/2}\,\frac{\delta}{\sigma}, \tag{4}$$

where $N_{\rm tr}$ is the total number of points in transit. Under simple assumptions
(Poisson-noise limited photometry of the source, no extinction, random
sampling) this can be written in terms of physical parameters as (Gaudi et al. 2005;
Gaudi 2005),

$$\frac{S}{N} \propto \left[R^{-3/2} M^{-1/6} L^{1/2}\right]\left[r^{2} P^{-1/3}\right]\ell^{-1}. \tag{5}$$

Given a minimum $(S/N)_{\min}$, one can invert this equation to determine the
distance $\ell_{\max}$ out to which one can detect a given planet orbiting a given star.
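
As a minimal numerical sketch of equations (4) and (5), the functions below compute the white-noise S/N, invert it for the number of in-transit points needed to reach a threshold, and give the ratio of limiting distances implied by the scaling of equation (5) for a fixed host star. Function and argument names are hypothetical, and correlated noise is deliberately ignored.

```python
import math


def transit_snr(n_in_transit, depth, sigma):
    """Equation (4): S/N ~ sqrt(N_tr) * depth / sigma for uncorrelated noise."""
    return math.sqrt(n_in_transit) * depth / sigma


def points_needed(sn_min, depth, sigma):
    """Invert equation (4) for the number of in-transit points at threshold."""
    return (sn_min * sigma / depth) ** 2


def max_distance_ratio(radius_ratio=1.0, period_ratio=1.0, sn_min_ratio=1.0):
    """Ratio of l_max values from equation (5) for the same host star:
    l_max scales as r^2 * P^(-1/3) / (S/N)_min."""
    return radius_ratio ** 2 * period_ratio ** (-1.0 / 3.0) / sn_min_ratio


if __name__ == "__main__":
    # A 1% deep transit observed with 1% photometry needs ~100 in-transit
    # points to reach S/N = 10.
    print(points_needed(10.0, 0.01, 0.01))
    # Doubling the planet radius quadruples the limiting distance.
    print(max_distance_ratio(radius_ratio=2.0))
```

The first example makes the point of Section 2.1 explicit: 1% photometry on a 1% deep transit yields S/N = 1 per point, so many in-transit points are required before a detection threshold is reached.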

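The expected number of planets detected around a uniform population of
stars is (Pepper et al. 2003),

$$\langle N\rangle \propto \frac{\Omega}{3}\,n\,\ell_{\max}^{3}, \tag{6}$$

where $n$ is the volume density of stars, and $\Omega$ is the area of the field-of-view.
Since equation (5) implies $\ell_{\max} \propto r^{2} P^{-1/3} (S/N)_{\min}^{-1}$, and the transit
probability scales as $P_{\rm tr} \propto P^{-2/3}$, it is then relatively straightforward to show
that, at a limiting S/N,

$$\langle N\rangle \propto P^{-5/3}\,r^{6}\left(\frac{S}{N}\right)_{\min}^{-3}. \tag{7}$$



2 For example, the $\alpha$ parameter in the BLS search algorithm (Kovacs et al. 2002) is often used
to select the best transit candidates, and is closely related to the S/N.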
Although trivial to derive, this expression has profound implications for
transit surveys. For example, it implies that S/N-limited transit surveys are
$\sim (1/3)^{-5/3} \sim 6$ times more sensitive to planets with $P \sim 1$ day than planets with
$P \sim 3$ days. This effect alone largely explains why the first planets uncovered by
transit surveys had periods that were shorter than any found in radial velocity
surveys (Gaudi et al. 2005). Also, the extremely strong scaling with planet
radius, $\propto r^6$, implies that transit surveys will always detect the extreme, bloated
planets first; this bias must be properly considered when interpreting the radius
distribution of observed planets (Gaudi 2005). Finally, the number of detected
planets is expected to be a strong function of the limiting S/N, and thus survey 
teams must carefully specify their S/N limit in order to assess their expected 
yield. This is not necessarily trivial because, although they may initially use 
automated cuts with rigid thresholds to select candidates, subsequent rejection 
of candidates by visual inspection imposes a higher (and more difficult to model) 
S/N threshold. 
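
The scalings of equation (7) are simple enough to evaluate directly. The sketch below (hypothetical function name) returns the ratio of expected yields for two planet populations or two detection thresholds in a purely S/N-limited survey of a uniform stellar population, which is the idealized setting in which equation (7) holds.

```python
def relative_yield(period_ratio=1.0, radius_ratio=1.0, sn_min_ratio=1.0):
    """Yield ratio from equation (7): <N> scales as P^(-5/3) r^6 (S/N)_min^(-3)."""
    return (period_ratio ** (-5.0 / 3.0)
            * radius_ratio ** 6
            * sn_min_ratio ** (-3))


if __name__ == "__main__":
    # P = 1 day versus P = 3 days: about a factor of 6.
    print(relative_yield(period_ratio=1.0 / 3.0))
    # A planet 20% larger in radius: about a factor of 3.
    print(relative_yield(radius_ratio=1.2))
    # Raising the S/N threshold by 50%: yield drops to about 30%.
    print(relative_yield(sn_min_ratio=1.5))
```

Even a modest, unmodeled increase in the effective threshold, such as that introduced by visual vetting, therefore changes the predicted yield by a large factor in this idealized limit.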



3. Constraints on the Frequency of Planets 

A number of groups have used transit surveys to measure or constrain the fre- 
quency of short-period giant planets based on the results of completed surveys. 
The majority of these studies have focused on cluster environments, where it is 
easier to determine the properties of the target stars. 



The first such study was the now-famous HST survey of 47 Tuc (Gilliland et al.
2000), which found no transiting planets and used this null result, combined
with an estimate of the survey efficiency, to demonstrate that the frequency of
short-period planets was more than an order of magnitude smaller than in the
local solar neighborhood. This conclusion was confirmed and strengthened by
Weldrake et al. (2005), who further argued that metallicity was the likely cause
for this difference in the planet population.

A number of groups have searched for planets in open clusters, without 
success. Several of these groups have calibrated their detection efficiencies and 
used these to place weak constraints on the frequency of short-period planets in
these systems (Mochejska et al. 2005, 2006; Bramich & Horne 2006; Burke et al.
2006).

Few groups have attempted to use the results from field surveys to constrain 
the frequency of short period planets. As mentioned previously, field surveys are 
complicated by the fact that the source stars are located at a range of distances 
and suffer from a range of extinctions, and therefore the relevant parameters 
of the target stars (e.g. their radii and mass) are not known simply from their 
observed fluxes and colors. Because of this, forward modeling provides the best 
method of estimating the efficiencies of field surveys.

In a comprehensive study, Gould et al. (2006) modeled the expected yield
of the first two campaigns of the OGLE transit survey (Udalski et al. 2002a,b,c,
2003), taking careful account of the survey selection effects, and using a detailed
model for the population of source stars. They then used the five planets
detected in the OGLE survey to infer that the fraction of stars with planets is
$(1/710) \times 1^{+\ldots}_{-0.54}$ for P = 1-3 days and $(1/320) \times 1^{+\ldots}_{-\ldots}$ for P = 3-5 days,
consistent with the results from RV surveys. They noted, however, that magnitude-
limited RV surveys are biased toward metal-rich (and so planet-rich) stars, while
transit surveys are not; therefore one would generally expect to find a deficit of
planets in transit surveys in comparison to RV surveys. Gould et al. (2006)
also demonstrated that the sensitivity of the OGLE surveys declined rapidly for
$r \lesssim R_{\rm Jup}$, indicating that little can be said about the frequency of sub-Jovian
sized planets.

Figure 1.: The distribution of predicted TrES detections in the Lyr1 field as a
function of the primary mass (Top Left), radius (Top Right), distance (Bottom
Right), and absolute magnitude (Bottom Left). The solid lines are
for uncorrelated uncertainties; the dashed-dot line is for uncertainties correlated
at the 0.3% level. The locations of the stars TrES-1 and TrES-2 are also shown. TrES-2
lies within Lyr1, while TrES-1 is in the nearby Lyr0 field. From Beatty & Gaudi
(2007).



4. Predictions for Ongoing and Future Surveys 



Several authors have developed models of the expected yields of transit
surveys, with various levels of complexity. Horne (2003) derived a simple analytic
expression for the yields of pencil-beam transit surveys and applied this to
several ongoing projects, whereas Pepper et al. (2003) presented and applied the
formalism for estimating the yields of all-sky surveys. Pepper & Gaudi (2005)
developed a model to predict the number of detected planets in surveys of
stellar systems; this was significantly extended and improved upon by Aigrain et al.
(2003).

Brown (2003) included, for the first time, expectations for the rates of
false alarms as well, and Gillon et al. (2005) used a detailed model to estimate
and compare the potential of several space-based and ground-based surveys.
Beatty & Gaudi (2007) attempt to build upon and advance these previous
studies, accounting for as many real-world effects as possible, including the
variation of the stellar density along the line-of-sight, a S/N detection criterion,
various noise sources (source, sky, scintillation, saturation, uncorrelated and
correlated^3 uncertainties), apparent magnitude limits, the stellar mass function,
the magnitude-scale height relation, requirements on the minimum number of
detected transits, and arbitrary bandpasses.

Table 1.: Predicted TrES Yields (R < 13 and r = 1 R_Jup)

                        Period
              1-3 days   3-5 days   Both
  S/N > 10      3.96       4.47     8.43
  S/N > 15      3.23       3.30     6.53
  S/N > 20      2.53       2.35     4.88
  S/N > 25      1.94       1.66     3.60
  S/N > 30      1.49       1.19     2.68

Figure 1 shows predictions for the number of transiting planets detected
by the TrES survey (Dunham et al. 2004; Alonso et al. 2004; O'Donovan et al.
2006) toward one of their target fields in Lyra, assuming the frequencies of giant
planets from Gould et al. (2006). The field center, number of observations, and
photometric errors were taken from the TrES website^4. For this field, $\langle N\rangle = 0.6$
detections are expected for planets with $r = 1.0\,R_{\rm Jup}$ for $R < 13$ and S/N > 20,
assuming uncorrelated noise. Assuming noise correlated at the level of ~ 0.3%,
the number of detections generically drops by a factor of ~ 2. Over the ~ 10
TrES fields which have been exhausted for planets, a total of $\langle N\rangle \sim 5$ detections
are expected in this model. This compares reasonably well with the actual
yield of two detections. The remaining discrepancy could be because correlated
noise is important, or the effective S/N is higher. Table 1 shows the number
of detections as a function of the limiting S/N. For example, for S/N > 30, 
the number of detections drops to (N) ~ 2.7. This further demonstrates the 
important point that, in order to use the observed yield of TrES (or any other 
transit survey) to infer the frequency of short period planets, it is essential to 
accurately characterize the limiting detection threshold. 
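
One way to gauge how well two detections agree with these predictions is a simple Poisson comparison. The sketch below is illustrative only: it assumes that the "Both" column of Table 1 gives the survey-wide expectation (as the quoted totals of ~ 5 and ~ 2.7 suggest), that detections are Poisson distributed, and that the model uncertainties can be ignored. The names are hypothetical.

```python
import math

# Totals from the "Both" column of Table 1, keyed by the limiting S/N.
TABLE1_BOTH = {10: 8.43, 15: 6.53, 20: 4.88, 25: 3.60, 30: 2.68}
OBSERVED = 2  # actual TrES yield quoted in the text


def poisson_cdf(k_max, mean):
    """P(N <= k_max) for a Poisson distribution with the given mean."""
    return sum(math.exp(-mean) * mean ** k / math.factorial(k)
               for k in range(k_max + 1))


def chance_of_at_most_observed(sn_limit):
    """Probability of finding OBSERVED or fewer planets at a given S/N limit."""
    return poisson_cdf(OBSERVED, TABLE1_BOTH[sn_limit])


if __name__ == "__main__":
    for sn_limit in sorted(TABLE1_BOTH):
        print(sn_limit, round(chance_of_at_most_observed(sn_limit), 3))
```

Under these assumptions, two or fewer detections occur about 14% of the time for S/N > 20 and about half the time for S/N > 30, so the inferred level of agreement depends directly on the adopted threshold.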

One important ingredient that is missing in most of the previous simula- 
tions is the ability to predict the rate of false positives, including grazing eclipsing 
binaries, unrelated blends, and hierarchical triples. The rates for these
astrophysical false positives are expected to be even higher than the rate of bona fide
planet detections. The one exception is the detailed model of Brown (2003);
however, this model did not include some of the important effects considered by
others, such as density variations due to Galactic structure or a S/N detection 
criterion. A more sophisticated model that incorporates all the relevant effects 
is needed. 



3 See Pont et al. (2006).

4 http://www.astro.caltech.edu/~ftod/tres/sleuthObs.html
Figure 2.: The cumulative number of transiting, Jupiter-sized, short-period
(P = 1-5 days) planets around Sun-like stars as a function of Galactic latitude,
for various limiting V magnitudes. From Beatty & Gaudi (2007).



4.1. The Potential of All-Sky Synoptic Surveys 

One difficulty with trying to understand the trends that are emerging among the 
population of transiting planets is simply the small size of the sample. Ideally, 
one would like to be able to subdivide the sample and ask, e.g., how the trends 
depend on the mass of the primary star. For such analyses, a sample size of at
least a hundred transiting planets will be required. This is an order of magnitude
larger than the number of transiting planets known today. Figure 2 shows the
cumulative number of transiting, Jupiter-sized, short-period (P = 1-5 days)
planets orbiting solar type stars as a function of Galactic latitude for several
different limiting V magnitudes, based on the simulations of Beatty & Gaudi
(2007). There are $\langle N\rangle \sim 200$ such planets over the whole sky down to V =
12. Unfortunately, current wide-field surveys are unlikely to survey a sufficient
fraction of the sky to detect more than a few dozen of these, given that the most
ambitious of these projects only monitor ~ 10% of the sky.

Increasing the number of known transiting planets by an order of magnitude 
will likely require a fundamentally different approach, or at least a significant 
upgrade to the current experiments. Plans are being made to this end, but it is 
interesting to ask what the potential is to detect transiting planets in the large 
scale synoptic surveys that are currently being built or planned (and that
are not specifically designed to detect transiting planets). 

An accurate estimate of the yield of large synoptic surveys requires careful 
simulations. This can be difficult, since in some cases the relevant parameters 
of the experiments have not been finalized. However, we can make a crude 
estimate by combining the predictions from Figure 2 with a rough estimate
of the limiting magnitude of the survey. Assume a given setup with diameter
$D = D_0$ can achieve a (source-noise limited) precision of $\sigma = \sigma_0$ on a star with
$V = V_0$ with an exposure time of $t_0$. Then, for uncorrelated noise, the limiting
magnitude of stars around which a survey can detect transiting planets with
S/N > $(S/N)_{\min}$ is,

$$V_{\rm lim} = 5\log\left[\left(\frac{\epsilon T}{t_0}\,\frac{\Omega_0}{\Omega}\,\frac{R}{\pi a}\right)^{1/2}\frac{D}{D_0}\,\frac{\delta}{\sigma_0}\left(\frac{S}{N}\right)_{\min}^{-1}\right] + V_0, \tag{8}$$

where $T$ is the total duration of the experiment, $\Omega$ is the total area surveyed,
$\Omega_0$ is the field-of-view of the camera, $a$ is the semimajor axis of the planets,
and $\epsilon$ is the total survey efficiency (fraction of $T$ spent exposing).

Table 2.: Predicted Yields for Large Synoptic Surveys (r = R_Jup, P = 3 days)

                          Sun-like            M-dwarfs
  Survey               V_lim     <N>       V_lim     <N>
  LSST                 18.5      7740      23.1      15530
  SDSS-II              15.6      6.0       20.2      11.9
  Pan-STARRS           15.0      19.2      19.6      36.5
  Pan-STARRS Wide      12.5      48.0      17.1      81.6

Table 2 shows the number of short-period, Jupiter-sized planets
detected at S/N > 20 for four surveys: the SDSS-II supernova survey (Sako et al.
2005), the Pan-STARRS^5 medium-deep survey, a Pan-STARRS survey with
the same specifications as the medium-deep survey, but covering 10 times the 
area (with the same amount of time), and LSST assuming a 10-year survey 
covering 20,000 deg 2 . Predictions are shown for both sun-like stars and M dwarfs 
(which can be detected to much fainter magnitudes because the transit depths 
are larger). The potential of SDSS-II and the nominal Pan-STARRS survey are 
small, due primarily to the fact they are targeted toward the Galactic poles and 
are very deep, and so 'run up' against the finite Galactic scale height. On the 
other hand, a wider Pan-STARRS survey spending the same amount of time 
would detect at least twice as many planets. 
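
The trade-off between depth and area can be made explicit by evaluating the limiting magnitude numerically. The sketch below assumes that equation (8) has the form V_lim = 5 log10[ sqrt((eps T / t0)(Omega_0 / Omega)(R / (pi a))) (D / D_0)(delta / sigma_0) / (S/N)_min ] + V_0, which follows from equation (4) with photon-limited noise. The function name, argument names, and every numerical input are hypothetical placeholders rather than the specifications of any actual survey.

```python
import math


def limiting_magnitude(efficiency, duration, t0, fov, area, r_over_a,
                       aperture_ratio, depth, sigma0, sn_min, v0):
    """Limiting magnitude for transit detection with uncorrelated noise.

    duration and t0 share one time unit; fov and area share one angular unit.
    """
    # Number of reference-length exposures that land in transit for one star.
    n_in_transit = (efficiency * duration / t0) * (fov / area) * r_over_a / math.pi
    # S/N the survey would reach on a star of magnitude v0.
    sn_at_v0 = math.sqrt(n_in_transit) * aperture_ratio * depth / sigma0
    # Photon-limited noise grows as 10^(0.2 * (V - v0)).
    return 5.0 * math.log10(sn_at_v0 / sn_min) + v0


if __name__ == "__main__":
    base = dict(efficiency=0.2, duration=3.0e7, t0=60.0, fov=7.0, area=70.0,
                r_over_a=0.114, aperture_ratio=1.0, depth=0.01, sigma0=0.01,
                sn_min=20.0, v0=15.0)
    wide = dict(base, area=700.0)
    print(limiting_magnitude(**base), limiting_magnitude(**wide))
```

In this form, spreading the same observing time over ten times the area makes the limiting magnitude 2.5 mag brighter, which matches the difference between the two Pan-STARRS rows of Table 2.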

The greatest potential comes from LSST, which could detect as many as 
~ 8000 transiting short-period planets around Sun-like stars, and ~ 15,000
transiting planets around M-dwarfs, assuming the frequency of short-period planets
around M-dwarfs is the same as for Sun-like stars. Of course, culling all of 
these planets poses enormous challenges. Simply identifying the transiting plan- 
ets themselves will be difficult, due to the large number of trial periods that 
must be searched given the 10 year duration of observations. More worrisome, 
however, is the fact that these detections will likely be associated with a much 
larger number of astrophysical false positives. The large number of expected 
candidates, combined with the fact that the majority of the candidates will be 
quite faint ($V \gtrsim 16$), implies that it will be difficult to follow up every candidate
individually to exclude false positives. 



5 http://pan-starrs.ifa.hawaii.edu
Clearly much work needs to be done to realize the potential of large scale
synoptic surveys for the detection of transiting planets. The first step is to pro- 
vide detailed simulations of the yield of these large surveys. These simulations 
must include predictions for false positives, in order to determine their expected 
contribution, as well as identify observable trends that distinguish these astro- 
physical backgrounds from the signal due to the planet population of interest. 
Then, it needs to be determined whether methods can be devised to use these 
trends to reliably separate the false positives from the bona fide transiting plan- 
ets, and so enable one to construct a statistical sample of short-period transiting 
planets. 